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USING EPIDEMIC PREVALENCE DATA TO JOINTLY ESTIMATE 
REPRODUCTION AND REMOVAL 

By Jan van den Broek and Hiroshi Nishiura 

Utrecht University 

This study proposes a nonhomogeneous birth-death model which 
captures the dynamics of a directly transmitted infectious disease. 
Our model accounts for an important aspect of observed epidemic 
data in which only symptomatic infecteds are observed. The nonho- 
mogeneous birth-death process depends on survival distributions of 
reproduction and removal, which jointly yield an estimate of the ef- 
fective reproduction number R(t) as a function of epidemic time. We 
employ the Burr distribution family for the survival functions and, as 
special cases, proportional rate and accelerated event-time models are 
also employed for the parameter estimation procedure. As an exam- 
ple, our model is applied to an outbreak of avian influenza (H7N7) 
in the Netherlands, 2003, confirming that the conditional estimate 
of R(t) declined below unity for the first time on day 23 since the 
detection of the index case. 

1. Introduction. The data-generating process of an epidemic has special 
characteristics to which one wants to pay particular attention when modeling 
these data. First, the observed data of an infectious disease outbreak are 
limited in the sense that the incidence — expressed as the number of newly 
infected individuals as a function of time — is usually measured by symptom 
onset of disease, which is sometimes further accompanied by reporting delay. 
Thus, all of the observed cases in the reported data represent those who 
experienced infection at some point in time in the past. Second, epidemic 
data sets do not usually include information on the number of susceptible 
individuals as a function of time, but solely records infected (interpreted as 
symptomatic) individuals. It is therefore unknown if susceptible individuals 
in the past are still susceptible at a point of time. Third, the susceptible 
population is usually not well defined at the beginning of an outbreak, and 
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its size may vary with time due to time-dependency in contact behavior 
and public health countermeasures during the outbreak. In a veterinary 
context, the countermeasures might include a transportation ban during an 
infectious disease outbreak on animal farms. Control measures are taken 
not only to reduce the number of contacts, but also to limit the ability 
of infected individuals to generate secondary cases. For instance, one may 
think of preemptive culling in the case of an infectious disease outbreak 
on animal farms. Fourth, since the infection is transmitted from individual 
to individual, observation of an infected individual is not independent of 
observing other individuals. 

These characteristics lead us to consider developing a method which ap- 
propriately captures the dynamics of a directly transmitted infectious disease 
by modeling the number of infected-and-detected individuals, preferably in 
discrete time, in order to quantify the reproduction of infected individuals 
in a nonhomogeneous manner. This contrasts with other statistical models 
which measure the population of susceptibles and model the force of infec- 
tion at which these susceptibles get infected. 

As is usually assumed, one can think of a population in which an epi- 
demic of an infectious disease occurs as consisting of three groups (or sub- 
populations) of individuals. The first is the susceptible population which 
represents individuals who have not been infected yet but may experience 
infection in the future. The second is a population of infectious individuals 
which consists of those who have been infected and are infectious to oth- 
ers. The last group consists of removed or recovered individuals who are no 
longer infectious and may be immune or are removed from the population. 
The simplest type of the model which describes the transmission dynamics 
over time is referred to as an SIR (Susceptible-Infected-Recovered) model 
[Diekmann and Heesterbeek (2000)]. Since the present study will concen- 
trate on the number of infected, here we consider their dynamics alone. 
Letting x(t) and y(t) be the susceptible and infectious fractions of the pop- 
ulation at time t, respectively, the derivative of y(t) is expressed as 

(i.i) ^y(t) = P(t)v(t)x{t)-ti(t)y{t). 

Note that the transmission rate f3(t) and the removal rate /i(t) may de- 
pend on time. If one rewrites the product of the transmission rate j3(t) and 
susceptibles x(t) as X(t) (= (3(t)x(t)), then the equation is rewritten as 

(1-2) ±y(t) = \(t)y(t)-v(t)y(t), 

which is the equation of the deterministic nonhomogeneous birth-death pro- 
cess. The function X(t) has been referred to as the reproductive power and 
can be interpreted as the rate at which a single infected individual is able 
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to generate secondary cases [Kendall (1948)]. In other words, X(t) is the 
rate at which an infected individual is able to reproduce itself. The so-called 
death rate /i(t) in a birth-death process is interpreted as the rate at which 
an infected individual is removed from the sub-population of infected indi- 
viduals. It should be noted that equation (1.2) relaxed the definition of y(t) 
compared to that in (1.1). Namely, whereas y(t) in (1.1) has to be infec- 
tious to others, we can instead regard y{t) in (1.2) as infected- and- detected 
individuals (i.e., regardless of infectiousness) . 

One of the advantages of using this simple equation is that the population 
of susceptibles is allowed to vary over time. Therefore, the reproductive 
power varies over time due to two different reasons: (1) the population of 
susceptibles x{t) varies as a function of time and (2) the transmission rate 
(3(t) is nonhomogeneous over time. In addition, the removal rate (i(t) is 
allowed to vary with time. 

It can be an advantage to model infection process stochastically, because 
one can explicitly define the probability of transmission, rather than de- 
terministically stating if the transmission happens [Andersson and Britton 
(2000)]. A stochastic model can describe not only the quantitative patterns 
of observation with time-dependent expected values, but also offer standard 
errors of the parameters without making adhoc distributional assumptions. 
More importantly, the likelihood function can be explicitly derived, which 
will be useful for statistical inference of parameters and critical assessments 
of the modeling method. Moreover, such a stochastic process can model the 
number of infected over time as being dependent. 

The present study aims to develop a stochastic model which is based on a 
nonhomogeneous birth-death process. The model is applied to an observed 
data set of infected-and-detected (but not yet removed) cases, permitting 
reasonable assessment of the time course of an epidemic. In Section 2 the 
stochastic version of the nonhomogeneous birth-death model is comprehen- 
sively described. A novel analytical solution of the model is obtained with the 
use of a general Lagrange transformation derivation of which has not been 
explicitly discussed to date. In Section 3 we discuss a conditional discrete- 
time fitting method. Although a similar conditional fitting procedure was 
employed in a recent study [van den Broek and Heesterbeek (2007)], the 
present study is the first to apply the technique to a model where both 
birth and death rates are nonhomogeneous. Depending on the model and 
the data given, the number of parameters to be estimated can be large and 
it might be difficult to get stable estimates. Besides, a relationship between 
the reproductive power and the removal rate might exist. Therefore, more 
restricted models which employ this relationship are discussed in Section 4. 
In Section 5 our model is applied to an observed epidemic data set of avian 
influenza A (H7N7) in the Netherlands, 2003. 
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2. The nonhomogeneous birth death model. The stochastic differential 
equations for a nonhomogeneous birth-death process, with Y(t) and yo being 
the number of infected-and-detected at time t and the initial number of 
infected-and-detected at time 0, respectively, are as follows: 

lp y (t) = \(t)(y - l)p y -i(t) + fi(t)(y + l) Py+1 (t) - (A(t) + »(t))yp y (t), 

^p (t)=p(t)p 1 (t), 

where X(t) denotes the reproductive power, fi(t) the death rate and Pr(Y(t) = 
y) = Py(t) the probability that the number of infected-and-detected individ- 
uals at time t is y. It should be noted that here we consider Y(t) as the 
number of infected-and-detected individuals which represents the observed 
elements of the data and is irrelevant to infectiousness. If we take the prob- 
ability generating function of the probabilities p, we can derive a partial 
differential equation for this fraction by multiplying the above differential 
equations with z y , summing the result, and taking the derivative. The ana- 
lytical solution of the partial differential equation is [Kendall (1948)] 

_ N [ 9 - (1 - 7T - 9)ul yo 
&(y,u)= \ — 

1 — TTU 

If we write p{t) = / *[//(r) - A(r)] dr = log[§4§] and 



rt ft 

(2.1) 7 (i) = / e p(r) A(r)dr = - 



we get 

(2.2) 9 = 1 

(2.3) vr = l 



dS x (r) 



o Jo Sfii T ) 

e-P® , S^t) 



l + 7 (t) e -P(t) S x (t) +1 (t)S^t)' 
1 1 S x (t) 



l + 7 (t) e -P(*) S x (t) +1 (t)S^t)- 

It should be noted that S\(t) = e~^o A ( T ) dT [ s the reproduction survival func- 
tion and Sfj,(t) = e~ fa M r ) dr i s the removal survival function. 

To obtain the probability distribution from <&(y,u), the general Lagrange 
transformation is useful (the details of which can be found elsewhere 
[Consul and Famoye (2006)]). First, let $(y,u) = [9 + (1 - 0)ii^H]fo = 
[9 + (1 — 9)ifj(u)] yo , where ip{u) is the probability generating function (pgf) 
of the geometric distribution. Second, let g(z) = 1 — tt + irz, the pgf of a 
Bernoulli distribution. Numerically, the smallest root of the transforma- 
tion z = ug(z) defines a pgf z = i/j(u) = ^~^ M [Consul and Famoye (2006)]. 
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Third, additionally considering f(z) = (6 + (1 — 8)z) yo , the pgf of the dis- 
crete general Lagrange probability distribution under the Lagrange trans- 
form z = ug(z) is given by f(z) = f(ip(u)) = [0 + (1 — 6) ^_^„" ] i/Q and, more- 
over, the probability mass function is a special case of the double-binomial 
distribution [Consul and Famoye (2006), pages 22-27], which can be referred 
to as the Bernoulli-binomial Lagrangian distribution in the terminology of 
[Johnson, Kemp and Kotz (2005)] 

P(Y(t) = 0) 
P(Y(t)=y) 

(2.4) 



where 9 and tt are defined in equations (2.2) and (2.3). It should be noted 
that g(z) and f(z) are pgf's and, thus, the necessary conditions for La- 
grange transformation are satisfied. To the best of our knowledge, detailed 
derivation of equation (2.4) has never been discussed in the context of the 
nonhomogeneous birth-death model (see Section 6). 
The expectation and the variance of (2.4) are 

E{y) 



var(y) 



by using (2.2) and (2.3). 

The expected value has two interpretations. The first part is the pre- 
dicted number of infected individuals at time to who survived removal [i.e., 
yoSfj,(t)]. The second part, s ^ , measures the rate at which a nonremoved 
infected individual reproduces itself. This is similar to an interpretation of 
a nonhomogeneous birth process [van den Broek and Heesterbeek (2007)]; 
the difference of the present study from the previous nonhomogeneous birth 
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process is that in the present setup only the predicted nonremoved infected- 

S (t) 

and-detected individuals reproduce. Second, the ratio of the rates g^ry is 
the net reproduction ratio with which an infected individual reproduces it- 
self, which is interpreted as the effective reproduction number R(t) as a 
function of epidemic time t. R(t) in the present study can be regarded as 
the average number of secondary cases generated by a single primary case at 
time t. That is, our R(t) is an instantaneous measure of secondary transmis- 
sions occurring at time t, whose definition is equivalent to the period total 
fertility rate in mathematical demography [Nishiura and Chowell (2009)]. If 
R(t) < 1, it suggests that the epidemic is in decline and may be regarded as 
being "under control" at time t [vice versa, if R(t) > 1]. It should be noted 
that the expected value of (2.4) is equivalent to an analytical solution of the 
deterministic version of a nonhomogeneous birth-death process (1.2). 

The term j(t) in the formula for the variance represents the dependence 
between the birth and death rate as can be seen from (2.1). As it is clear 
from the analytical expression for the variance, the variance becomes large 
if the probability of nonremoval is large for an infected individual, if the 
probability of reproduction is large, or both. This matches intuitive sense. 
In addition, the variance can be regarded as a type of negative binomial 
variance, if we rewrite it as var(y) = E(y)[l + 2l ~ o l E(y)]. 

3. Fitting the model. We have shown how the epidemic data can be gen- 
erated by a stochastic nonhomogeneous birth-death process. Nevertheless, 
the observed data are, in reality, just one sample path of all the possible 
sample paths that can arise from such an epidemic process. Considering fur- 
ther that the number of infected-and-detected and individuals at a certain 
point in time t depends on the number of infected-and-detected individu- 
als at some time point before t, our model is fitted to the data by con- 
ditioning on the transmission dynamics which happened before t [Becker 
(1989), Becker and Yip (1989)]. Moreover, as we briefly discussed in the 
Introduction, another important point in practice is the discrete nature of 
the time points of observation, that is, say, tj, j = 0,1, ... , n, where the time 
unit might typically be days or weeks. Therefore, the number of infecteds at 
time tj is modeled conditionally on the number of infecteds by time tj-\. 

Since the probability mass function (2.4) is a conditional probability mass 
function, conditioning being on the number of infecteds at to, this can be 
effectively used as the conditional model for the number of infecteds at time 
tj, given the number of infecteds at time tj-%. For this reason, the survival 
distributions S\(t) and S^t) and, of course, j(t), are also conditioned on 
the past. Let T\ and be the stochastic variables that measure the repro- 
duction time and the removal time, respectively. The conditional survival 
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probability for the reproduction time is 
P(T x >t j \T x >t j -i 



S x (tj-i) 

= i-P(T x e(t j _ 1 ,t j ]\T x >t J _ 1 ) 
= 1 - h\(tj-i), 

where h x (tj—\) is interpreted as the discrete reproductive power. Similarly, 
the conditional survival probability for the removal time is given by 

P(T„ > tj\Tp > tj-x) = 1 - M^-i), 

where hn(tj-x) is the discrete removal hazard. 

The discrete conditional version of (2.1) in the time interval (tj-i, tA is 

P(T x (=(t j - 1 ,t j ]\T x >t j - 1 ) _ hxjtj-!) 

p(r M >t i |r M >t i _i) i-/t M (^_i)' 

When these conditional discrete measurements are considered, 9 and ir cor- 
respond to hfj,{tj-x) and h x (tj_i), respectively. Using these, the conditional 
probability for (2.4) is expressed as 



P(Y(tj) = Olr^-i) = yu.,) = h„(tj-i) 
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The expected value of this probability is yt J ^ 1 > which is referred to 

as the sample path profile [Lindsey (2001)]. The corresponding conditional 
measurement of the effective reproduction number is R(tj-\) = \~h~$rH ; 

with V(^-i) = 1 - and Wi-i) = 1 " sfii- 

Let A be the vector of parameters from the survival distributions (see 
Section 4). The log- likelihood function is then 

n 

1(A) = £ log[P(y(t,) = y tj IFfo-i) = A)]. 
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Note that the likelihood is evaluated only for y tj > 0, because zero prevalence 
is the absorbing state of the process and this state is not observable in reality. 

The log-likelihood can be maximized using an optimization procedure, 
such as the Nelder-Mead method to find the maximum likelihood estimates. 
In our exercise, the software system R [R Development Core Team (2008)] 
is used. The information matrix is used to find the standard errors of the 
parameters, and we use Akaike's information criterion (AIC) to compare 
model fits. 



4. The Burr distribution and its special cases. To choose a particular 
form for the survival functions, one might take the early phase of the out- 
break into account. The mean of the probability mass function (2.4) depends 
on these survival functions and is the same as the solution of (1.2). Since (1.2) 
can be derived from the SIR model, one might look at the deterministic SIR- 
model to decide the parametric form of the survival function. In the early 
phase of the outbreak a deterministic SIR-model can be well approximated 
by a deterministic Si-model since in that phase the number of removals is 
limited. The dynamic equations for this Si-model hold for the fraction of 
susceptibles and for the fraction of infected and since the fraction of sus- 
ceptibles at a point of time is the same as the fraction of individuals with 
infection time larger than that time point, the dynamic equations should 
also hold for the survival function. The Burr family of distribution functions 
has this precise property [van den Broek and Heesterbeek (2007)]. 

When detection of a symptomatic infected individual occurs, he/she usu- 
ally will be removed immediately. Thus, it is reasonable to assume that the 
reproductive power and the removal rate have a similar structure, and follow 
a similar survival function. 

The most well-known and useful distribution from the Burr family is the 
Burr XII, or Singh-Maddala distribution, which in the literature is some- 
times referred to simply as the Burr distribution. The survival function is 
given by 



S(t) 



1 + 



t > 0,a,b, q > 0. 



The right tail is governed by the parameters a and q, the left tail by a, and 
b is the scale parameter [Kleiber and Kotz (2003), page 198]. To reduce the 
number of parameters to be estimated, one can consider three special cases 
of the Burr distribution [Kleiber and Kotz (2003)]: 

1. The logistic form is obtained for q = l, giving the log- logistic or the Fisk 
distribution. 

2. For a = 1, the Burr distribution is reduced to the Lomax (Pareto type II) 
distribution. 
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3. The case a = q is also known as the para-logistic distribution. 



The Weibull distribution and the Pareto distribution are limiting cases of 
the Burr distribution [Shao (2004)]. An interesting way to arrive at the Burr 
distribution is to assume that the times follow a Weibull distribution, the 
scale parameter of which follows an inverse generalized gamma distribution 
[Kleiber and Kotz (2003)]. 

As another way to reduce the number of model parameters, and to find a 
further relationship between the reproductive power and the removal rate, 
one may rewrite the Burr distribution as a proportional rate or an acceler- 
ated event-time distribution (just as in the case of the more famous Weibull 
distribution). Suppose that the survival function for the reproduction time 
is a Burr distribution with parameters a, b\ and q and suppose that the sur- 
vival function of the removal time is also a Burr distribution with parameters 
a, 62 and q. If we replace 62 by db\ (where d is a constant), we get 



Therefore, the survival distribution for the removal times is exactly the same 
as that for the reproduction time, except that the removal time is interpreted 
as accelerated reproduction time. 

To employ the proportional rate model, let the survival distribution for 
the reproduction time be a Burr distribution with parameters a, b and q±, 
and suppose that the survival distribution of the removal times is also a 
Burr distribution with parameters a, b and q2- If we replace q2 by cq\ (where 
c is a constant), we get 



-52 



a-i — cqi 



1 + 



0-1 -51 



{Sx(t)} c , 



indicating that the rates are proportional; that is, the rate at which an 
infected-and-detected is removed is proportional to the rate at which an 
infected-and-detected individual reproduces. Of course, the Burr distribu- 
tion can also be written as both an accelerated event-time distribution and 
as a proportional rate distribution, that is, S^t) = [S\(t')] c . 

All the distributions described above can be written in accelerated event- 
time form and in proportional hazard form, except for log-logistic distribu- 
tion which has only an accelerated event-time form. In the next section we 
fit all of those models to epidemic data of avian influenza A (H7N7) in the 
Netherlands, 2003. 



5. Dutch avian influenza A (H7N7) epidemic in 2003. Here we show an 
example of our model application to an observed data set. An epidemic 
of avian influenza A (H7N7) virus started on February 28, 2003, in the 
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Gelderse Vallei in the Netherlands. In total, 239 flocks experienced infection 
with known detection date. Control measures taken include movement re- 
strictions, stamping out of infected flocks, and preemptive culling of flocks in 
the neighborhood of infected flocks. As a result, 1255 commercial flocks and 
17,421 flocks of smallholders had to be depopulated, and approximately 25.6 
million animals were killed. The virus was also transmitted to humans who 
had been in close contact with the infected chickens, resulting in one human 
death. Further details can be found elsewhere [Stegeman et al. (2003)]. 

We examine transmission and detection events between flocks. We regard 
the detection date of a case (i.e., infected individual) as the date at which 
there were first signs of infection in a flock. In other words, the detection 
date of an infected individual is regarded as the birth date in our model. 
Therefore, the birth date is not the date of infection but the date at which an 
infected farm is detected, which is used as a surrogate. Moreover, the date of 
depopulation is regarded as the death date. Consequently, the Dutch data 
consist of the prevalence of infected-and-detected (but not yet removed) 
flocks on each epidemic date. 

Figure 1 shows the temporal distribution of the prevalent cases (repre- 
senting those who were born and have not been removed yet). As can be 
seen, the right tail contains gaps and the center of the distribution is not 
well determined. Usually these make it difficult to fit simple models to the 
data. 

The Burr model with three parameters for birth rate and three for death 
rate was fitted to the observed data. We refer to it as the full Burr model. 
To objectively show how this model fits to the data better than other types 
of models, we compared its likelihood with that of the full inverse Burr 
(Burr III or Dagum) model. The inverse Burr model is also known as a 
flexible distribution from the Burr family and can be viewed as a generalized 
gamma with a scale parameter that follows an inverse Weibull distribution 
[Kleiber and Kotz (2003)]. The inverse Burr yielded an AIC of 363.7, while 
the full Burr model yielded 342.4. We thus examined the full Burr model 



Table 1 

Akaike's information criterion (AIC) for different models 





Full model 


Acc. event time 1 


Prop, rate 2 


Both acc. event and prop, rate 


Burr 


342.4 


345.90 


346.4 


342.7 


Log-logistic 


371.3 


369.8 






Lomax 


341.9 


344.8 


344.9 


same as full model 


Para-logistic 


351.8 


354.0 


357.5 


same as full model 



1 Accelerated event-time model. 
2 Proportional rate model. 
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6 13 21 29 37 45 53 61 69 
Days 

Fig. 1. Temporal distribution of the prevalent cases of avian influenza A (H7N7) epi- 
demic in the Netherlands, 2003. The index case was reported on February 28, 2003 (and 
the date is defined as day 0). The prevalent cases represent those who have been infected 
and detected but have not been depopulated yet at day t, which correspond to the expected 
value of y t in Section 3. See Stegeman et al. 2003 for further information. 

(and its special cases) for further analyses. The AICs for these different 
models are in Table 1. The full Lomax model gave the best fit, although the 
difference in AIC with the full Burr distribution was not particularly large. In 
other words, the information criterium suggest that both the death rate and 
the reproductive power may be proportional and that the removal time may 
be accelerated reproduction time in the observed data set. Figure 2 visually 
confirms a good fit of the full Lomax model to the observed number of 
infected-and-detected individuals based on the conditional model in discrete 
time. The model seems to have well captured the observation, because fitting 
prevalence yt j to the data is conditioned on yt J _ 1 - It should be noted that 
the predicted values in Figure 2 reflect a qualitative pattern of the observed 
data always one or two steps late, which is a general tendency of conditional 
fit. 

Parameter estimates for the best fitting model are shown in Table 2 
with their standard errors. The logarithm of the acceleration factor, d, 
was estimated at 1.47 with a standard error of 0.369, and the logarithm 
of the proportionality between the reproductive power and the removal rate 
was 1.54 with a standard error of 0.355. The estimates of mean reproduc- 
tion and removal times for the Lomax distribution can be calculated from 
[Kleiber and Kotz (2003)] E(t) = ^~j^r~~^ ■ The mean reproduction time 
was estimated as 1.81, indicating that it takes on average 1.8 days for a de- 
tected case to reproduce another detected case. The mean removal time was 
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3 6 9 12 16 20 24 28 32 36 40 44 48 52 56 60 64 68 72 76 

Days 



Fig. 2. Comparison of the observed numbers and the predicted values from the condi- 
tional model of prevalent cases of avian influenza A (H7N7) epidemic in the Netherlands, 
2003. The index case was reported on February 28, 2003 (and the date is defined as day 
0). Observed data (bars) is compared with the predicted number of cases (solid line) based 
on the full Lomax model. It should be noted that the expectation of prevalence yt- is con- 
ditioned on yt j _ 1 • 

estimated as 2.79, indicating that on average it takes 2.8 days for a detected 
infected case to be removed. 

In Figure 3 the rate at which a single case survives removal, (1 — h^tj-i)), 
is shown as a function of epidemic date. In addition, the rate at which a 
single nonremoved case reproduces secondary cases, (1/(1 — h\(tj-i)), is 
also shown in the figure. The product of these two functions jointly yields 
R(tj-i) (also shown in Figure 3). Compared with other modeling results 
[e.g., Nishiura and Chowell (2009)], our estimates of R(tj-i) are smoothed 
as a function of time owing to our parametric model for the survival func- 
tions of reproduction and removal. Nevertheless, it should be noted that our 
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Table 2 

Parameter estimates for the Lomax 
distribution 



Parameter 


Estimate 


St. error 


ln(6i) 


3.235 


0.5998 


ln(gi) 


2.712 


0.3611 


ln(6 2 ) 


4.987 


1.0396 


ln(q 2 ) 


3.980 


0.8480 



approach does not have to assume that the generation time distribution is 
known [i.e., common assumption in estimating R(t)], because our approach 
does not have to translate a growth rate of incidence to the reproduction 
number. As we discussed above, R(tj-±) < 1 suggests that the epidemic is 
in decline at time tj-i [vice versa, if R(tj-±) > 1]. This can be understood 



(a) (b) 




20 40 60 20 40 

days days 



Fig. 3. Time dependency of birth and death rates which jointly yield the effective repro- 
duction number, (a) ((1 — (ij-i) ) indicates the rate at which an infected- and- detected 
case escapes removal, whereas (1/(1 — h\(tj-\))) denotes the rate at which a single in- 
fected- and- detected (but not yet removed) case reproduces secondary cases. The product 
(1 — hfi(tj-i)) /(l — h\(tj-i)) yields the effective reproduction number R(t) as a function 
of time, which can be interpreted as the average number of secondary cases generated by 
a single primary case at time tj-i. If R(tj-i) < 1, it suggests that the epidemic is in de- 
cline. In our example, the expected value of R(tj-i) declined below unity on day 23 since 
the detection of index case, (b) The effective reproduction number, R(tj-i), calculated from 
sample paths drawn from the underlying estimated process (gray lines) with the estimated 
value (black line) and the 95% percentile lines (dashed lines). 
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by considering the condition for R(tj-\) = 1; that is, the reproductive power 
becomes equivalent to the removal rate at time t. In our example, the ex- 
pected value of declined below unity for the first time on day 23 since 
the detection of the index case, supporting eventual end of the epidemic in 
the later stage. The sawtooth at the end of the lines is considered to have 
been caused by zero prevalence during the corresponding time period (i.e., 
because of our conditional measurement, the survival functions reflect small 
variations in the observed data). Figure 3(b) shows the estimated effective 
reproduction number R{tj-\). To get an idea of the statistical uncertainty of 
R(tj-i), 500 sample paths were drawn from the estimated nonhomogeneous 
birth-death model and for each sample path the effective reproduction num- 
ber was calculated (gray lines). The 95% percentile lines of i?(ij_i) are also 
shown. 

6. Discussion. In the present study we modeled an epidemic based on 
the nonhomogeneous birth-death process, addressing some of the critical 
issues which are seen in the observation of directly transmitted infectious 
diseases. First, we modeled infected-and-detected individuals, which corre- 
sponds to observable and countable information in practice (e.g., our model 
requires neither susceptibles nor infectious individuals). Second, for a similar 
reason, the application of a birth-death process allowed the population at 
risk (i.e., the susceptible population) to vary with time. Third, applying the 
concepts of a nonhomogeneous birth-death process to epidemic modeling, 
dependent events (i.e., dependence of a single infected individual on other 
infected individuals) were addressed in the model. Fourth, our stochastic 
model offered an explicit likelihood function and yielded a standard error 
of parameters. Last, our model allowed estimation of the effective reproduc- 
tion number, R(t). Although a different probability distribution was given 
by Bailey (1964), the derivation was not given in the literature, and, to the 
best of our knowledge, equation (2.4) is the first to derive the pdf explic- 
itly. 

In a recent study van den Broek and Heesterbeek (2007) applied a non- 
homogeneous birth process to epidemic data, in which the survival-time 
distribution was modified by a final-size parameter to describe the end of 
an epidemic (which was influenced by public health countermeasures) . The 
countermeasures would not only reduce the final size of an epidemic but also 
the reproductive power as a function of time, because secondary transmis- 
sions caused by infected individuals are restricted under the control mea- 
sures. The nonhomogeneous birth process in the previous study permitted 
an explicit assessment of the time variations in the number of newly infected 
individuals (and thus the reproductive power). 

In the present study the proposed nonhomogeneous birth-death process 
further improved our understanding of time dependency by explicitly adding 
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the nonhomogeneous removal rate. Whereas the reproductive power changes 
as a function of time due to variations in susceptible individuals or in the 
transmission rate, the removal rate is also nonhomogeneous when infected 
individuals are likely to be removed upon detection. Thus, adding nonhomo- 
geneous death to a nonhomogeneous birth process enabled us to separately 
consider the effectiveness of countermeasures as a reduction in reproductive 
power (e.g., reduction in infectious contacts) and an increase in removal rate 
of infected individuals (e.g., culling of infected farms). In this way, the fad- 
ing out of an epidemic was modeled in a smoother way, as compared to the 
previous model based on a nonhomogeneous birth process alone. 

Moreover, it should be noted that our model does not necessarily require a 
homogeneous mixing assumption to describe contacts, because our assump- 
tions of the time-dependent rates implicitly include those nonhomogeneities. 
For example, our model allows time variations in susceptible individuals. 
Nevertheless, our model only accounts for the nonhomogeneity with respect 
to time in an explicit manner, and understanding other heterogeneous as- 
pects of transmission requires further information. 

Of course, there are many possible candidate distributions to model the 
reproduction and the removal times. We have selected the Burr distribution 
for three reasons: 

1. As noted in Section 4, the Burr family coincides with our analytical un- 
derstanding of the epidemic modeling, especially at the early stage of an 
epidemic. 

2. Essentially, the Burr distribution is flexible and has special and limit- 
ing cases. For instance, the distribution can be regarded as a Weibull 
distribution with a random scale parameter. 

3. In the context of the present study, the proportional hazard rate and 
the accelerated event time interpretation may be very helpful in model 
reduction and further interpretation of the data. 

As an example, the nonhomogeneous birth-death model was applied to 
epidemic data of avian influenza A (H7N7) in the Netherlands, 2003, showing 
that the model fitted to the data very well. Indeed, since the data set has 
information on both birth and death events for each individual case, the 
Dutch data appeared very useful for fitting prevalence data and applying 
our modeling method. Even in the presence of gaps in the right tail of the 
epidemic curve, and even though the center was not well determined, our 
model reasonably described the time-course of the observed epidemic. In 
particular, our model permitted an estimation of the effective reproduction 
number R(t) as a function of time without imposing a specific distribution 
of the generation time. 

As is often the case with natural outbreaks, a single observation repre- 
sents just one sample path from the process for which the above-mentioned 



16 



J. VAN DEN BROEK AND H. NISHIURA 



model is imposed as the generator. There is no random sampling of in- 
fectious disease outbreaks, and a repeated sampling interpretation for the 
resulting model fit might be difficult. In other words, the description and 
conclusions arising from analysis of a single outbreak data set is valid only 
for that outbreak. To find some general disease-specific conclusions from 
such an exercise, we stress that it is important to analyze several different 
outbreaks for the same disease. For such a purpose, one may use our model 
to accumulate the experience of applying our method to several outbreaks. 
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